Lecture Notes: Introduction to R, RStudio and RMarkdown-V2.0
Henry Mendoza Rivera and Gloria R. Bautista Mendoza
August 26, 2020
RStudio
RStudio is an integrated development environment (IDE) for R. It includes a console, syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging and workspace management. https://rstudio.com/
RStudio Desktop installation
Step 1: Click on download RStudio.
Step 2: Click on Download RStudio Desktop
Step 3: Click on the file indicated by \(\color{blue}{\text{1}}\) to download RStudio for Mac, or on the file indicated by \(\color{blue}{\text{2}}\) to download it for Windows.
Step 4: Follow the installation steps and accept all the default options.
RStudio Cloud
RStudio Cloud is a lightweight, cloud-based solution that allows anyone to do, share, teach and learn data science online.
Go to RStudio Cloud
Click on Get Started for free
Click on Sign Up to create an account.
Sign in with your username and password.
Once you are inside RStudio Cloud, click on New Project. You can now use RStudio Cloud in much the same way as RStudio Desktop.
RMarkdown
Markdown is a lightweight markup language with plain-text-formatting syntax, created in 2004 by John Gruber with Aaron Swartz. We will use RMarkdown to write Homework solutions and any other report document during the course.
Once you are in RStudio, you can write the RMarkdown document required for your homework submission.
Creating an RMarkdown Document in RStudio Desktop
Step 1: Create and organize folders on the Desktop
- Create a folder Stat371 on the Desktop.
- Create a subfolder Data inside the folder Stat371.
- Create a subfolder Graphs inside the folder Stat371.
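The same folder layout can also be created from R instead of by hand. The following is only a sketch: the helper name make_course_folders is hypothetical, and the demonstration uses a temporary directory as a stand-in for the Desktop, so it does not touch your real files.

```r
# Create the folder Stat371 with its subfolders Data and Graphs
# inside a base directory (hypothetical helper, not part of R itself).
make_course_folders <- function(base_dir) {
  course_dir <- file.path(base_dir, "Stat371")
  for (d in c(course_dir,
              file.path(course_dir, "Data"),
              file.path(course_dir, "Graphs"))) {
    # recursive = TRUE also creates missing parent folders;
    # showWarnings = FALSE keeps the call quiet if the folder already exists.
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  course_dir
}

# Demonstration in a temporary directory.
course_dir <- make_course_folders(tempdir())
list.files(course_dir)
```

On your own computer you would pass the path of your Desktop instead of tempdir(), for example make_course_folders("~/Desktop"), assuming that is where your Desktop folder lives.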
Step 2: Set your working directory
The working directory is a file path on your computer that sets the default location of any files you read into R. Set your working directory as follows:
- Go to the menu and click on Session. Then select Set Working Directory and click on Choose Directory.
- Now go to your Desktop, select the folder Stat371, and click on Open.
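The menu steps above correspond to the R functions setwd() and getwd(). Here is a minimal sketch that uses a temporary folder as a stand-in for the Desktop folder; the variable names are chosen only for illustration.

```r
old_wd <- getwd()   # remember the current working directory

# Stand-in for the course folder; on your computer this would typically be
# "~/Desktop/Stat371" (an assumption about where your Desktop is located).
course_dir <- file.path(tempdir(), "Stat371")
dir.create(course_dir, recursive = TRUE, showWarnings = FALSE)

setwd(course_dir)        # same effect as Session > Set Working Directory
current_wd <- getwd()    # check where R is now looking for files
print(basename(current_wd))

setwd(old_wd)            # restore the original working directory
```

Running getwd() at any time is a quick way to confirm that R will look for Data/lbw.csv in the folder you expect.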
Step 3: Create a new RMarkdown document
Open RStudio. Check first that you have already installed R and then RStudio; if you have not, go to Modules>Orientation>Introduction-to-R.html.
Go to the menu and click on File. Then select New File and R Markdown.

- Select Document and HTML. Then fill in the title and name.
Title: Homework 1
Name: your first and last name.

- Delete everything below \(\color{blue}{\text{## R Markdown}}\).

- Replace \(\color{blue}{\text{## R Markdown}}\) with \(\color{blue}{\text{# Situation 1}}\) or any other title.
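After Step 3 the file should contain little more than a header and the new title. The sketch below writes such a skeleton from R so that you can see its plain-text structure. The file name Homework1.Rmd and the author text are placeholders, and the header RStudio generates for you may contain extra fields, such as a date.

```r
# Lines of a minimal RMarkdown file: a YAML header between the two
# "---" lines, followed by the first section title.
rmd_lines <- c(
  "---",
  'title: "Homework 1"',
  'author: "Your Name"',      # placeholder: use your own name
  "output: html_document",
  "---",
  "",
  "# Situation 1"
)

# Write the skeleton to a temporary folder (placeholder file name).
rmd_path <- file.path(tempdir(), "Homework1.Rmd")
writeLines(rmd_lines, rmd_path)

# Show the file exactly as it is stored on disk.
cat(readLines(rmd_path), sep = "\n")
```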
Step 4: Write your document
Suppose that the first situation to solve in your homework looks like this:
Situation 1
Find and interpret the mean of the following data
160, 170, 175, 148, 175, 185, 190, 145, 162
Then you should proceed as follows:
- Copy and paste the situation statement from your homework document below \(\color{blue}{\text{# Situation 1}}\). Then write \(\color{blue}{\text{## Solution}}\).
- Insert a chunk: click on \(\color{blue}{\textbf{1}}\) and then on \(\color{blue}{\textbf{2}}\). See the figure below.

- Use the c() function (where c means “combine”) to combine numbers into a vector. Copy and paste the code below into your RMarkdown document:
Mydata.weight <- c(160, 170, 175, 148, 175, 185, 190, 145, 162) # weight in pounds
mean(Mydata.weight) # find the mean (average) of the weight
## [1] 167.7778

Run the code inside the chunk as follows:
Mac: Shift + Command + Enter (press the Shift, Command, and Enter keys at the same time)
Windows/Linux: Shift + Ctrl + Enter (press the Shift, Ctrl, and Enter keys at the same time)
or go to \(\color{blue}{\textbf{1}}\) and then \(\color{blue}{\textbf{2}}\) as indicated in the figure below. This works on both Mac and Windows.
- Knit your document to get the html document
- Click on the Knit button. RStudio will open a window asking for the folder where your document should be saved.
- On your Desktop, select your folder Stat371 (do not go inside the folder, just select it) and click on Save.
- Write the interpretation
Interpretation: the average weight of a person in this data set is 167.8 lbs.
- Knit again to generate the new HTML file. The final document looks like
\(\color{red}{\Large\textbf{Knit }}\) your document again every time you add a new piece of information.
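As a check on the value used in the interpretation, the mean can also be computed by hand as the sum of the observations divided by how many there are. The variable names below are chosen only for this illustration.

```r
weights <- c(160, 170, 175, 148, 175, 185, 190, 145, 162)  # weight in pounds

total <- sum(weights)      # sum of the nine observations
n     <- length(weights)   # number of observations

manual_mean <- total / n   # the definition of the mean
manual_mean

# Rounded to one decimal place, as reported in the interpretation.
round(manual_mean, 1)

# The built-in function gives the same value.
all.equal(manual_mean, mean(weights))
```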
How to insert Images in RMarkdown file (html)
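We can insert images in the document. To insert an image:
Place it in your folder Graphs.
Outside a code chunk, write the following, where myimage.png is a placeholder for the name of your image file:
![](Graphs/myimage.png)
- To add alt text to your image, write it between the square brackets []:
![Description of the image](Graphs/myimage.png)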
How to insert Tables in RMarkdown file (html)
Create the table below using the Markdown tables generator.
| Dead | 10 | 17.65 | 24.59 | 20.79 | 2.22 | 19.74 | 20.59 | 21.72 |
| Live | 12 | 18.95 | 27.14 | 23.16 | 2.76 | 20.92 | 23.16 | 25.17 |
I recommend using https://www.tablesgenerator.com/markdown_tables to generate the table. The following steps will help you edit it:
Click on Column to insert a column.
Click on Row to insert a row.
Enter the information in the cells.
Click on Column->Text Align. Then select the type of text alignment required.
To delete a column, click on Column->Remove.
To delete a row, click on Row->Remove.
To insert a LaTeX symbol, place the cursor in the cell and write the LaTeX expression. For example, $\mu$
Another type of table is the ANOVA table.
Create the table below using the Markdown tables generator.
| Treat (between) | \(df_{Trt} = t-1\) | SSTrt | \(MSTrt =\frac{SSTrt}{df_{Trt}}\) | \(F =\frac{MSTrt}{MSE}\) | \(p = P(F_{df_{Trt}, df_{E}} > F)\) |
| Error (within) | \(df_{E} = N - t\) | SSE | \(MSE =\frac{SSE}{df_{E}}\) | | |
| Total | \(df_{Tot} = N - 1\) | SSTot | | | |
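To see how the entries of the ANOVA table fit together, the sketch below computes each of them in R for a small made-up data set. The data, the variable names, and the column names of the resulting data frame are assumptions for illustration. The sums of squares use the standard one-way ANOVA definitions, which the table above does not spell out.

```r
# Hypothetical data: t = 3 treatments with 4 observations each (N = 12).
response  <- c(20, 22, 19, 21,  25, 27, 26, 24,  30, 29, 31, 32)
treatment <- factor(rep(c("A", "B", "C"), each = 4))

N        <- length(response)     # total number of observations
t_groups <- nlevels(treatment)   # number of treatments (t in the table)

grand_mean <- mean(response)
fitted     <- ave(response, treatment)   # group mean for each observation

# Sums of squares (standard one-way ANOVA definitions).
SSTrt <- sum((fitted - grand_mean)^2)
SSE   <- sum((response - fitted)^2)
SSTot <- sum((response - grand_mean)^2)

# Degrees of freedom, mean squares, F statistic and p-value,
# following the formulas in the table.
df_Trt <- t_groups - 1
df_E   <- N - t_groups
df_Tot <- N - 1
MSTrt  <- SSTrt / df_Trt
MSE    <- SSE / df_E
F_stat <- MSTrt / MSE
p_value <- pf(F_stat, df_Trt, df_E, lower.tail = FALSE)  # P(F_{df_Trt, df_E} > F)

anova_table <- data.frame(
  Source = c("Treat (between)", "Error (within)", "Total"),
  df     = c(df_Trt, df_E, df_Tot),
  SS     = c(SSTrt, SSE, SSTot),
  MS     = c(MSTrt, MSE, NA),
  F      = c(F_stat, NA, NA),
  p      = c(p_value, NA, NA)
)
anova_table

# Cross-check against R's built-in one-way ANOVA.
summary(aov(response ~ treatment))
```

The last line is there only as a check: the F value and p-value printed by summary(aov(...)) should agree with the hand-computed ones.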
Reading Data set in R
Before reading a data set into R, make sure you have created the Data folder as explained in the \(\color{blue}{\text{Creating an RMarkdown}}\) section. We are going to read the data set lbw into R.
Data set lbw from the Low Birth Weight Study
Go to Canvas, open Modules>Students Resources>lbw.csv, download the lbw data set, and save it in the subfolder Data inside the folder Stat371.
Set your working directory. The working directory is a file path on your computer that sets the default location of any files you read into R. Set your working directory as follows:
- Go to the menu and click on Session. Then select Set Working Directory and click on Choose Directory.
- Now go to your Desktop, select the folder Stat371, and click on Open.
- Insert a chunk: click on \(\color{blue}{\textbf{1}}\) and then on \(\color{blue}{\textbf{2}}\).

- Copy and paste the code below inside the Chunk
lbw <- read.csv("Data/lbw.csv") # comma-separated file with "." as the decimal mark
Run the code as follows:
Go to \(\color{blue}{\textbf{1}}\) and then \(\color{blue}{\textbf{2}}\) as indicated in the figure below. This works on both Mac and Windows.
- Check that the variables were read properly. We will use the R function str(). Add str(lbw) inside the chunk as follows:
lbw <- read.csv("Data/lbw.csv") # comma-separated file with "." as the decimal mark
str(lbw)
## 'data.frame': 189 obs. of 10 variables:
## $ low : int 0 0 0 0 0 0 0 0 0 0 ...
## $ smoke: int 0 0 1 1 1 0 0 0 1 1 ...
## $ race : int 2 3 1 1 1 3 1 3 1 1 ...
## $ age : int 19 33 20 21 18 21 22 17 29 26 ...
## $ lwt : int 182 155 105 108 107 124 118 103 123 113 ...
## $ ptl : int 0 0 0 0 0 0 0 0 0 0 ...
## $ ht : int 0 0 0 0 0 0 0 0 0 0 ...
## $ ui : int 1 0 0 1 1 0 0 0 0 0 ...
## $ ftv : int 0 3 1 2 0 0 1 1 1 0 ...
## $ bwt : int 2523 2551 2557 2594 2600 2622 2637 2637 2663 2665 ...
You can also run the chunk by going to \(\color{blue}{\textbf{1}}\) and then \(\color{blue}{\textbf{2}}\) as indicated in the figure below. This works on both Mac and Windows.
Notice that R read the variables low, smoke, race, ht, and ui as integers. Let’s change them into factors (categorical variables). To do this, we use the function as.factor as follows:
lbw <- read.csv("Data/lbw.csv") # comma-separated file with "." as the decimal mark
lbw$low <- as.factor(lbw$low)
lbw$smoke <- as.factor(lbw$smoke)
lbw$race <- as.factor(lbw$race)
lbw$ht <- as.factor(lbw$ht)
lbw$ui <- as.factor(lbw$ui)
Now run str(lbw) again:
## 'data.frame': 189 obs. of 10 variables:
## $ low : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
## $ smoke: Factor w/ 2 levels "0","1": 1 1 2 2 2 1 1 1 2 2 ...
## $ race : Factor w/ 3 levels "1","2","3": 2 3 1 1 1 3 1 3 1 1 ...
## $ age : int 19 33 20 21 18 21 22 17 29 26 ...
## $ lwt : int 182 155 105 108 107 124 118 103 123 113 ...
## $ ptl : int 0 0 0 0 0 0 0 0 0 0 ...
## $ ht : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
## $ ui : Factor w/ 2 levels "0","1": 2 1 1 2 2 1 1 1 1 1 ...
## $ ftv : int 0 3 1 2 0 0 1 1 1 0 ...
## $ bwt : int 2523 2551 2557 2594 2600 2622 2637 2637 2663 2665 ...
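The read-then-convert pattern above can be tried without the course file. The sketch below writes a tiny comma-separated file to a temporary location as a stand-in for Data/lbw.csv (its values are for illustration only), reads it, and converts several columns to factors in one step with lapply() instead of one line per variable.

```r
# Small stand-in for Data/lbw.csv, written to a temporary file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "low,smoke,race,age,bwt",
  "0,0,2,19,2523",
  "0,1,1,20,2557",
  "1,1,3,25,2410"
), csv_path)

mini <- read.csv(csv_path)   # all five columns are read as integers

# Columns that hold category codes rather than measurements.
factor_cols <- c("low", "smoke", "race")

# Apply as.factor() to each of those columns and store the results back.
mini[factor_cols] <- lapply(mini[factor_cols], as.factor)

str(mini)   # low, smoke and race are now factors; age and bwt stay integer
```

With the real data you would use factor_cols <- c("low", "smoke", "race", "ht", "ui") on lbw, which gives the same result as the five as.factor lines above.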
Installing packages in R
What is a Package in R?
An R package is a collection of functions and data sets developed by the R community.
A package can include R code as well as code written in other programming languages, such as C++.
For help with a package, run code such as
help(package = "ggplot2")
Where are packages located?
Packages are located in
- CRAN
- Installing Packages From CRAN
How to install packages in R
For example, to use the ggplot2 package in R, follow the steps:
Step 1:
Install the package ggplot2. Use the R function install.packages() as follows (place the package name in quotes). Before running the code below, delete the # symbol at the start of the line. Then run the code.
# install.packages("ggplot2")
Step 2:
Load the package into R with the R function library(). (You do not need to put the package name in quotes.)
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.0.2
A blue greater-than symbol (\(>\)) should then appear in the console.
Step 3:
Add the # symbol back to the code in Step 1 so that the package is not reinstalled every time you run or knit the document (a commented line is shown in green in the RMarkdown chunk).
Installing multiple packages
Run the first line of the code below without the #, that is, install.packages(c("ggplot2","ggpubr")). After you have run it, add the # to the first line again. Note that library() loads one package per call, so each package needs its own library() line.
# install.packages(c("ggplot2", "ggpubr"))
library(ggplot2)
library(ggpubr)
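An alternative to adding and removing the # by hand is to ask R which packages are still missing and install only those. The sketch below shows one way to do this. The helper name packages_to_install is hypothetical, and the install line is left commented out so that nothing is downloaded unless you choose to run it.

```r
# Return the names in `pkgs` that are not installed on this computer.
# requireNamespace() returns TRUE or FALSE instead of stopping with an error.
packages_to_install <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- packages_to_install(c("ggplot2", "ggpubr"))
needed   # empty when both packages are already installed

# Uncomment and run once if `needed` is not empty:
# install.packages(needed)
```

After the packages are installed, load each one with its own library() call, as in the code above.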